Crate git_repository

This crate provides the Repository abstraction which serves as a hub into all the functionality of git.

It’s powerful and doesn’t sacrifice performance, while being more convenient than using the sub-crates individually. Sometimes it hides complexity on the assumption that the performance difference matters only to a handful of tools, which are expected to use the underlying crates directly or file an issue.

The prelude and extensions

With use git_repository::prelude::* you should be ready to go, as it pulls in various extension traits that make functionality available on the objects that can use it.

The method signatures are still complex and may require various arguments for configuration and cache control.
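The mechanism behind a prelude of extension traits can be sketched in plain Rust. This is only an illustration of the pattern: mylib, Id, IdExt and to_hex are hypothetical names made up for the sketch and are not part of git_repository.

```rust
// Hypothetical stand-ins; none of these names come from git_repository.
mod mylib {
    /// A plain id type that knows nothing about hex formatting.
    pub struct Id(pub [u8; 4]);

    pub mod ext {
        use super::Id;

        /// Extension trait adding a method to `Id` without changing the type itself.
        pub trait IdExt {
            fn to_hex(&self) -> String;
        }

        impl IdExt for Id {
            fn to_hex(&self) -> String {
                self.0.iter().map(|b| format!("{:02x}", b)).collect()
            }
        }
    }

    /// The prelude only re-exports the extension traits.
    pub mod prelude {
        pub use super::ext::IdExt;
    }
}

use mylib::prelude::*;

/// Compiles only because the glob import above brought `IdExt` into scope.
pub fn demo_hex() -> String {
    mylib::Id([0xde, 0xad, 0xbe, 0xef]).to_hex()
}
```

Without the glob import, the call to to_hex() would fail to compile, because trait methods are only callable while the trait is in scope. That is why a single prelude import is enough to unlock the extensions.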

Easy-Mode

Most extensions to existing objects provide an obj_with_extension.easy(&repo).an_easier_version_of_a_method() or easy(&repo) method to hide all complex arguments and sacrifice some performance for a lot of convenience.

When starting out, use easy(…) and migrate to the more detailed method signatures to squeeze out the last inkling of performance if it really does make a difference.

Object-Access Performance

Accessing objects quickly is the bread-and-butter of working with git, right after accessing references. Hence it’s vital to understand which cache levels exist and how to leverage them.

When accessing an object, the first cache queried is a memory-capped LRU object cache that maps object ids to their data and kind. On a miss, the object is looked up, and if a pack is hit, a small fixed-size cache for delta-base objects is used.

In scenarios where the same objects are accessed multiple times, an object cache can be useful. It has to be configured explicitly using the object_cache_size(…) method.

Use the cache-efficiency-debug cargo feature to learn how efficient the cache actually is. It’s easy to end up with lower performance if the cache is hit in fewer than 50% of lookups.
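To make the idea of a memory-capped LRU cache and its hit rate concrete, here is a minimal, self-contained sketch. It is not the cache used by this crate: LruCache, get_or_load and hit_rate are hypothetical names, and the O(n) bookkeeping is chosen for brevity rather than speed.

```rust
use std::collections::HashMap;

/// Hypothetical object id; the real crate uses its own hash types.
type Id = u32;

/// A toy memory-capped LRU cache that counts hits and misses.
pub struct LruCache {
    capacity_bytes: usize,
    used_bytes: usize,
    entries: HashMap<Id, Vec<u8>>,
    /// Ids ordered from least to most recently used.
    order: Vec<Id>,
    pub hits: u64,
    pub misses: u64,
}

impl LruCache {
    pub fn new(capacity_bytes: usize) -> Self {
        LruCache {
            capacity_bytes,
            used_bytes: 0,
            entries: HashMap::new(),
            order: Vec::new(),
            hits: 0,
            misses: 0,
        }
    }

    /// Return the cached data, or run `load` (the slow path) and cache its result.
    pub fn get_or_load(&mut self, id: Id, load: impl FnOnce() -> Vec<u8>) -> Vec<u8> {
        if let Some(data) = self.entries.get(&id) {
            let data = data.clone();
            self.hits += 1;
            self.touch(id);
            return data;
        }
        self.misses += 1;
        let data = load();
        self.insert(id, data.clone());
        data
    }

    /// Mark `id` as the most recently used entry.
    fn touch(&mut self, id: Id) {
        self.order.retain(|other| *other != id);
        self.order.push(id);
    }

    fn insert(&mut self, id: Id, data: Vec<u8>) {
        if data.len() > self.capacity_bytes {
            return; // too large to ever fit, so it is not cached
        }
        // Evict least recently used entries until the new object fits.
        while self.used_bytes + data.len() > self.capacity_bytes {
            let oldest = self.order.remove(0);
            if let Some(evicted) = self.entries.remove(&oldest) {
                self.used_bytes -= evicted.len();
            }
        }
        self.used_bytes += data.len();
        self.entries.insert(id, data);
        self.touch(id);
    }

    /// Fraction of lookups served from the cache, or `None` before any lookup.
    pub fn hit_rate(&self) -> Option<f64> {
        let total = self.hits + self.misses;
        if total == 0 {
            None
        } else {
            Some(self.hits as f64 / total as f64)
        }
    }
}

/// Access ids 1, 2, 1, 3, 1 with room for two 4-byte objects.
pub fn demo_hit_rate() -> (u64, u64) {
    let mut cache = LruCache::new(8);
    for &id in &[1u32, 2, 1, 3, 1] {
        cache.get_or_load(id, || vec![0u8; 4]);
    }
    (cache.hits, cache.misses)
}
```

In the demo access pattern, two of the five lookups are hits, so the hit rate is 40%. Every miss pays for the lookup plus the cache bookkeeping, which is how an undersized cache can end up slower than having no cache at all.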

Environment variables can also be used for configuration if the application calls apply_environment() on its Easy* handle.

Shortcomings & Limitations

  • Only one easy::Object or derivatives can be held in memory at a time, per Easy*.
  • Changes made to the configuration, packs, and alternates aren’t picked up automatically if they aren’t made through the underlying Repository instance. Call one of the refresh*() methods to trigger an update. Note that this only matters for long-running processes.

Design Sketch

The goal is to make the lower-level plumbing available without having to deal with any caches or buffers, and to avoid any allocation beyond sizing the buffer to fit the biggest object seen so far.

  • No implicit object lookups, thus an Oid must first obtain an Object via object() before it has any data to work with.
  • Objects with a Ref suffix can only exist one at a time unless they are transformed into an owned version OR multiple Easy handles are present, each providing another ‘slot’ for an object as long as it’s retrieved through the respective Easy object.
  • ObjectRef blocks the current buffer, hence many of its operations that use the buffer are consuming
  • All methods that access any field of Easy’s mutable State are fallible and return at least easy::Result<_>, to avoid panics if the field can’t be referenced due to the borrow rules of RefCell.
  • Anything attached to Access can be detached to lift the object limit or make them Send-able. They can be attached to another Access if needed.
  • git-repository functions related to Access extensions will always return attached versions of return values, like Oid instead of git_hash::ObjectId, ObjectRef instead of git_odb::data::Object, or Reference instead of git_ref::Reference.
  • Obtaining mutable references to the Repository is currently a weak spot, as these only work with Arc right now and can’t work with Rc<RefCell>, presumably due to missing GATs. None of the Easy* types except the Exclusive ones can provide a mutable reference to the underlying repository. However, other ways to adjust the Repository of long-running applications are possible. For instance, there could be a flag indicating that a new Repository should be created (for example, after it was changed), which causes the next server connection to create a new one. This instance is the one to use when spawning new EasyArc instances.
  • Platform types are used to hold mutable or shared versions of required state for use in dependent objects they create, like iterators. These come with the benefit of allowing for nicely readable call chains. Some of these are simply called Platform for lack of a more specific term, while others have more specific names, like Ancestors.
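Several of the points above (one object per handle, a buffer that only grows, and fallible access instead of RefCell panics) can be illustrated together. The following is a sketch under stated assumptions, not this crate’s implementation: State, ObjectRef, load, into_owned and capacity are hypothetical stand-ins built on std::cell::RefCell.

```rust
use std::cell::{BorrowError, BorrowMutError, RefCell, RefMut};

/// Hypothetical per-handle state: one reusable buffer, i.e. one object "slot".
pub struct State {
    buf: RefCell<Vec<u8>>,
}

/// Borrowed view of the object currently occupying the slot.
pub struct ObjectRef<'a> {
    pub data: RefMut<'a, Vec<u8>>,
}

impl<'a> ObjectRef<'a> {
    /// Copy the data out, consuming the view and thereby freeing the slot.
    pub fn into_owned(self) -> Vec<u8> {
        self.data.to_vec()
    }
}

impl State {
    pub fn new() -> Self {
        State { buf: RefCell::new(Vec::new()) }
    }

    /// Load `bytes` into the shared buffer, returning an error instead of
    /// panicking if an earlier `ObjectRef` still borrows it.
    pub fn load<'a>(&'a self, bytes: &[u8]) -> Result<ObjectRef<'a>, BorrowMutError> {
        let mut buf = self.buf.try_borrow_mut()?;
        buf.clear(); // keeps the allocation, so the capacity never shrinks
        buf.extend_from_slice(bytes);
        Ok(ObjectRef { data: buf })
    }

    /// Current buffer capacity, fallible for the same reason as `load`.
    pub fn capacity(&self) -> Result<usize, BorrowError> {
        self.buf.try_borrow().map(|buf| buf.capacity())
    }
}

/// Shows that a second object cannot be loaded while the first is alive.
pub fn demo_slot() -> (bool, Vec<u8>, Vec<u8>) {
    let state = State::new();
    let first = state.load(b"blob").expect("slot is free");
    // The slot is occupied, so loading another object fails gracefully.
    let second_failed = state.load(b"tree").is_err();
    // Turning the view into an owned value frees the slot again.
    let owned = first.into_owned();
    let reloaded = state.load(b"tree").expect("slot was freed").into_owned();
    (second_failed, owned, reloaded)
}
```

Using try_borrow_mut() rather than borrow_mut() is what turns a would-be panic into an ordinary error value. Consuming the view in into_owned() mirrors the idea that buffer-using operations give the slot back to the handle.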

Terminology

WorkingTree and WorkTree

When reading the documentation of the canonical git-worktree program, one gets the impression that work tree and working tree are used interchangeably. We use the term work tree only and try to do so consistently, as it’s shorter and assumed to mean the same.

Cargo-features

With the optional “unstable” cargo feature

To make using sub-crates easier, these are re-exported into the root of this crate. Note that they may change their major version even if this crate doesn’t, which can break downstream code.

git_repository::

  • hash
  • url
  • actor
  • bstr
  • objs
  • odb
    • pack
  • refs
  • interrupt
  • tempfile
  • lock
  • traverse
  • diff
  • parallel
  • Progress
  • progress
  • protocol
    • transport
      • packetline

Re-exports

pub use git_actor as actor;
pub use git_hash as hash;
pub use git_lock as lock;
pub use git_object as objs;
pub use git_object::bstr;
pub use git_ref as refs;

Modules

Which Easy* is for me?

Process-global interrupt handling

Structs

A handle to a Repository for use when the repository needs to be shared, providing state for one ObjectRef at a time, created with Repository::into_easy().

A handle to a Repository for sharing across threads, with each thread having one or more caches, created with Repository::into_easy_arc().

A handle to an optionally mutable Repository for use in long-running applications that eventually need to update the Repository to adapt to changes they triggered or that were caused by other processes.

A handle to a repository for use when the repository needs to be shared using an actual reference, providing state for one ObjectRef at a time, created with Repository::to_easy().

An instance with access to everything a git repository entails, best imagined as a container for most of the system resources required to interact with a git repository, which are loaded once the instance is created.

A borrowed reference to a hash identifying objects.

Enums

The kind of Repository

An owned hash identifying objects, most commonly Sha1

A repository path which either points to a work tree or the .git repository itself.

Functions

Type Definitions

The standard type for a store to handle git references.